arXiv:1503.06318v1 [cond-mat.stat-mech] 21 Mar 2015


From statistics of regular tree-like graphs to distribution function and gyration radius of branched polymers

Alexander Y. Grosberg 1,2 , and Sergei K. Nechaev 3,4 
1 Physico-Chimie Curie UMR 168, Institut Curie, 

PSL Research University, 26 rue d’Ulm, 75248 Paris Cedex 05, France
2 Department of Physics and Center for Soft Matter Research, 

New York University, 4 Washington Place, New York, NY 10003, USA 
3 Université Paris-Sud/CNRS, LPTMS, UMR8626, Bât. 100, 91405 Orsay, France
4 P.N. Lebedev Physical Institute of the Russian Academy of Sciences, 119991, Moscow, Russia 

(Dated: March 24, 2015) 

We consider a flexible branched polymer with quenched branch structure and show that its conformational entropy as a function of its gyration radius $R$, at large $R$, obeys, in the scaling sense, $\Delta S \sim R^2/(a^2 L)$, with $a$ the bond length (or Kuhn segment) and $L$ defined as an average spanning distance. We show that this estimate is valid up to at most a logarithmic correction for any tree. We do so by explicitly computing the largest eigenvalues of Kramers matrices for both regular and “sparse” 3-branched trees, uncovering on the way their peculiar mathematical properties.

PACS numbers: 02.50.-r; 61.25.hp 


I. INTRODUCTION 


We are interested in the statistical mechanics of long flexible branched objects. The leading example, but certainly not the only one, is a large RNA molecule. By means of covalent bonds acting between nucleotides along the chain and non-covalent saturating bonds between complementary bases, an RNA molecule folds into a peculiar secondary structure which is effectively a branched polymer. There is an enormous literature on RNA, and on branched polymers in general; however, we will be interested in a specific question which has not been sufficiently considered so far: what is the conformational entropy of a branched polymer in three-dimensional space, and how does it depend on the overall spatial span (e.g., on the gyration radius) and on the specific arrangement of branches? To make it clear: we imagine a branched polymer, such as an RNA with a given secondary structure, flexing its bonds and junctions without rearranging the secondary structure itself. We will refer to such an object as a quenched branched polymer, and we will be interested in the entropy associated with its spatial conformations.

Conformational entropy of this type was first written down by Daoud et al. [3] as $\Delta F/T \sim R^2/R_0^2$, where $R$ and $R_0$ were denoted the “actual” and “unperturbed” gyration radii, respectively. In the work [3], the authors examined the influence of excluded-volume interactions on the swelling of branched polymers and on the properties of branched polymers in solution, so for them $R$ was the average size of a self-avoiding polymer, while $R_0$ was considered ideal. Actually, $R_0$ was taken as $R_0 \sim aN^{1/4}$, i.e., as the mean size of an ideal tree [?], where the averaging was supposed to include both the spatial fluctuations of a given tree and all rearrangements of the tree branches (i.e., an annealed tree was considered). It is this double average that leads to the well-known Zimm-Stockmayer law, $R_0 \sim aN^{1/4}$, for ideal branched polymers [6].

Later on, a similar looking expression,

$$\Delta F/T \sim R^2/(a^2 L), \tag{1}$$

was employed in a more subtle context, emphasizing the difference between quenched and annealed branched polymers [?]. In this case, $a^2 L$ replaces $R_0^2$. Obviously, in this sense, $L$ can be thought of as the length of a Gaussian polymer with typical mean squared size $\sim R_0^2$. More meaningfully, $L$ was interpreted as an order parameter, the average spanning distance of a tree. This theory (applied in a variety of contexts [?]) implies the necessity to consider the conformational entropy of a branched polymer with fixed (quenched) branches, characterized by $L$. From this point of view, the free energy estimate $\Delta F/T \sim R^2/(a^2 L)$ may seem suspicious. Indeed, if $L$ is the spanning distance of the tree, we can imagine selecting one particular line of exactly $L$ bonds in the tree (call it a tree trunk), and then $R^2/(a^2 L)$ is (or seems to be?) the free energy of this tree trunk viewed as a linear polymer. In fact, to relate this to the free energy of a tree, two aspects have to be taken into account: first, when the tree is stretched (or swollen), all its branches are stretched, not only the trunk; second, if $R$ is the gyration radius of the tree, then the gyration radius of the trunk is somewhat smaller than $R$ (because branches provide a lot of mass close to the center). This leads us to the question: is the free energy estimate $\Delta F/T \sim R^2/(a^2 L)$ valid for every tree, with $L$ defined as an average spanning distance, at least in the scaling sense? In this paper, we set out to address this question, and our answer is positive: we will show that this estimate of entropy is valid for any tree, up to at most a logarithmic correction.


We build our arguments on the so-called Kramers theorem [12] (see also [?]) and its recent generalization found in [?]. These statements can be formulated as follows. Let us label all the bonds of the tree (in arbitrary order) with the index $x$, $1 \le x \le N-1$, where $N$ is the number of monomers (tree vertices). For a Gaussian system (without excluded volume, and when “bonds” are long enough compared to the persistence length), each bond vector $\mathbf{a}_x$ is normally distributed, $\sim e^{-3\mathbf{a}_x^2/2a^2}$, with the mean squared bond length $a^2 = \langle \mathbf{a}_x^2 \rangle$, and the vectors $\mathbf{a}_x$ and $\mathbf{a}_y$ are independent for any $x \neq y$. Let us further define the $(N-1)\times(N-1)$ matrix $G$ with the entries $G_{x,y} = M(x)M(y)/N^2$, where $M(x)$ is the number of tree vertices on one side of bond $x$, while $M(y)$ is the number of vertices on the other side of bond $y$; see the illustration in Fig. 1 and the more detailed discussion in Section II. In terms of this matrix the Kramers theorem reads:


Kramers theorem. The mean squared gyration radius of the tree is given by the trace $\mathrm{tr}\,G$:

$$\frac{\langle R^2\rangle}{a^2} = \sum_{x=1}^{N-1} G_{x,x} = \frac{1}{N^2}\sum_{x=1}^{N-1} M(x)\bigl(N - M(x)\bigr). \tag{2}$$

Here and below, $\langle\ldots\rangle$ means the quenched average, i.e., the average over all spatial conformations with fixed tree structure.

Generalization of Kramers theorem. For $R \gg \langle R\rangle$, the probability $P(R)$ for a given branched polymer (with quenched tree structure) to have gyration radius $R$ goes, up to power-law corrections, as

$$P(R)\big|_{R\to\infty} \sim e^{-3R^2/(2a^2\lambda_{\max})}, \tag{3}$$

where $\lambda_{\max}$ is the largest eigenvalue of the matrix $G$. In other words, the free energy price for swelling a quenched tree to a large size $R$ goes, up to logarithmic corrections, as

$$\Delta F/T \simeq 3R^2/(2a^2\lambda_{\max}). \tag{4}$$

The former statement (the Kramers theorem itself) can also be found in the textbook [2], while the latter statement (its generalization) was proven recently in [?]. To make the present work self-contained, we reproduce the derivation in Appendix A.
Figure 1: Towards the definition of the matrix $G_{x,y}$. On the tree, the bonds $x$ and $y$ naturally divide all nodes of the graph into three categories: on one side of $x$ (shaded), on the other side of $y$ (also shaded), and in between (not shaded). We denote the numbers of monomers in the former two categories as $M(x)$ and $M(y)$. Each matrix element is given by $G_{x,y} = M(x)M(y)/N^2$. In this particular example $G_{x,y} = 3\times 7/22^2$.
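The definition of $G_{x,y}$ and the Kramers theorem can be checked numerically on a small tree. The following is a minimal sketch, not taken from the paper: the function names (`kramers_matrix`, `mean_square_gyration`, `dendrimer`) and the vertex numbering are illustrative assumptions. It builds $G$ directly from the "masses on the outer sides of two bonds" rule and compares $\mathrm{tr}\,G$ with $\langle R^2\rangle/a^2$ computed independently from pairwise path lengths, using the Gaussian relation $\langle(\mathbf{r}_i-\mathbf{r}_j)^2\rangle = a^2 d_{ij}$, where $d_{ij}$ is the number of bonds between vertices $i$ and $j$.

```python
from collections import deque
from fractions import Fraction


def adjacency(n, bonds):
    adj = [[] for _ in range(n)]
    for u, v in bonds:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def reachable(adj, start, blocked):
    """Vertices reachable from `start` without entering vertex `blocked`."""
    seen = {start}
    queue = deque([start])
    while queue:
        w = queue.popleft()
        for nb in adj[w]:
            if nb != blocked and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen


def kramers_matrix(n, bonds):
    """G[x][y] = M(x) M(y) / N^2 for a tree with n vertices numbered 0..n-1."""
    adj = adjacency(n, bonds)
    # For bond (u, v): the vertices on the u side once the bond is cut.
    u_side = [reachable(adj, u, v) for u, v in bonds]

    def outer_mass(x, y):
        # Number of vertices on the side of bond x that does not contain bond y.
        size = len(u_side[x])
        y_on_u_side = all(end in u_side[x] for end in bonds[y])
        return n - size if y_on_u_side else size

    m = len(bonds)
    G = [[Fraction(0)] * m for _ in range(m)]
    for x in range(m):
        for y in range(m):
            if x == y:
                mx = len(u_side[x])
                my = n - mx
            else:
                mx, my = outer_mass(x, y), outer_mass(y, x)
            G[x][y] = Fraction(mx * my, n * n)
    return G


def mean_square_gyration(n, bonds):
    """<R^2>/a^2 = (1/N^2) * sum over pairs i<j of the path length d_ij."""
    adj = adjacency(n, bonds)
    total = 0
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            w = queue.popleft()
            for nb in adj[w]:
                if nb not in dist:
                    dist[nb] = dist[w] + 1
                    queue.append(nb)
        total += sum(dist.values())
    return Fraction(total, 2 * n * n)   # every pair was counted twice


def dendrimer(k):
    """Bonds of a regular 3-branching tree with k levels, listed level by level."""
    bonds, frontier, n = [], [0], 1
    for level in range(1, k + 1):
        new_frontier = []
        for parent in frontier:
            for _ in range(3 if level == 1 else 2):
                bonds.append((parent, n))
                new_frontier.append(n)
                n += 1
        frontier = new_frontier
    return n, bonds


n, bonds = dendrimer(3)
G = kramers_matrix(n, bonds)
trace = sum(G[x][x] for x in range(len(bonds)))
print(n, trace == mean_square_gyration(n, bonds))
```

Exact rational arithmetic is used so that the two sides of the theorem can be compared with `==` rather than within a floating-point tolerance. In this 22-vertex tree, a first-level bond and a second-level bond in a different main branch have outer masses 7 and 3, which gives the element $3\times 7/22^2$.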


The result (4) of the generalized Kramers theorem does not explain the relation between $\lambda_{\max}$ and the internal geometry/topology of the tree itself. In this paper we will establish the following:

• For a regular $p = 3$-branching tree (a “3-dendrimer”), $\lambda_{\max} = c_{\rm tree}$ with $c_{\rm tree} \approx 0.957$;

• For a regular sparse tree, where each bond is a linear polymer of length $s$, $\lambda_{\max} = s\,c_{\rm tree}$;

• For a linear polymer (or palm tree, with a trunk but without branches), $\lambda_{\max} = N/\pi^2$;

• We provide an interpolation of $\lambda_{\max}$ from “dense” ($s = 1$) to “sparse” ($N > s \gg 1$, almost palm) trees in the form $\lambda_{\max} \approx cs$, where $c$ is another constant of order unity.

• Based on the above examples, we conclude that in general, for any tree, $\lambda_{\max} = cL$, where $L$ is the average spanning distance of the tree, and the factor $c$ differs from a constant of order unity by at most a factor of order $\ln N$.

The last statement makes the bridge between equations (1) and (4), thus proving the former.


II. LARGEST EIGENVALUE OF KRAMERS MATRIX FOR REGULAR TREES 

A. Trees with branchings at every node 


To make the estimate (4) explicit, we computed the highest eigenvalue of the adjacency matrix $G$ of a regular symmetric tree. For simplicity we consider 3-branching trees only; however, the results can easily be generalized to trees with any branching number $p$. To specify the system, suppose that the tree has $k$ generations, and the total number of vertices is $N_k$. It is convenient to represent a tree by descending levels (generations), as shown in Fig. 2. The distance from the root point, $O$, is counted in the number of levels. In order to construct the adjacency matrix $G$ corresponding to a tree, we should enumerate the tree links in some way. The most straightforward choice is the sequential enumeration of links in each level. Namely, we start with the left-most link in the 1st level and enumerate all links in this generation from left to right, then proceed in the same way with the next level, etc. Thus, all links are enumerated sequentially by a natural number $x$, where $x = 1, 2, \ldots, 3(2^k - 1)$ (there are $N_k - 1$ links in total).
Figure 2: (a) An example of a 3-branching finite tree of $k = 3$ levels (generations). Vertices are enumerated by natural numbers sequentially left to right in descending order. The shadowed areas mark the subtrees associated with links $x = 5$ and $x = 3$. The mass of the subtree, $M(x)$, is counted in the number of vertices in this subtree. The matrix element $G_{x,y}$ of the adjacency matrix $G$ is the product of the masses for links $x$ and $y$; (b) An example of a “sparse” 3-branching tree of $k = 3$ levels. Everything is the same as in (a); however, neighboring branching points of the sparse tree are separated by a subchain of $s$ links ($s = 2$ in the figure). The tree links are enumerated sequentially first in linear subchains and then as in (a).


Let us repeat the construction of the adjacency matrix $G = \{G_{x,y}\}$ of the regular tree in the descent diagram representation (Fig. 2). Define the “mass,” $M(x)$, associated with the link $x$ as the number of vertices in the subtree which begins with the link $x$. The matrix element $G_{x,y} = G_{y,x}$ is the product of the masses $M(x)$ and $M(y)$ for links $x$ and $y$, correspondingly, divided by $N_k^2$: $G_{x,y} = M(x)M(y)/N_k^2$. In Fig. 2 we have demonstrated the construction of the element $G_{3,5} = G_{5,3}$ for the adjacency matrix of size $(N_3 - 1)\times(N_3 - 1) = 21\times 21$ (the tree has $N_3 = 22$ vertices). The explicit form of the matrix $G$ for the tree shown in Fig. 2(a) is given in Fig. 3(a). It is instructive to have a dictionary with the necessary definitions:


• The distance from the root point on the tree, counted in the number of levels, is denoted by $j$; the total number of levels in the tree is $k$;

• The total number of vertices, $N_k$, of the $k$-level 3-branching tree is $N_k = 3\times 2^k - 2$;

• The number of vertices, $N_{j,k}$, in the subtree starting from some link in the level $j$ is $N_{j,k} = 2^{k-j+1} - 1$, where $1 \le j \le k$. (In Fig. 2(a) the subtrees with $N_{j=2,k=3} = 3$ and $N_{j=1,k=3} = 7$ are shown.) The “mass,” $M$, of the subtree is counted in the number of vertices of this subtree.
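As a quick consistency check of these counts (a worked example added for illustration, not part of the original derivation), note that each subtree consists of its top vertex plus two subtrees one level down, and that level $j$ contains $3\times 2^{j-1}$ links:

```latex
\begin{aligned}
N_{j,k} &= 1 + 2N_{j+1,k}, \quad N_{k,k} = 1
  \;\Longrightarrow\; N_{j,k} = 2^{k-j+1} - 1, \\
N_k &= 1 + 3N_{1,k} = 1 + 3\,(2^{k} - 1) = 3\times 2^{k} - 2, \\
N_k - 1 &= \sum_{j=1}^{k} 3\times 2^{j-1} = 3\,(2^{k} - 1)
  \quad\text{(number of links)}.
\end{aligned}
```

For $k = 3$ this gives $N_{3,3} = 1$, $N_{2,3} = 3$, $N_{1,3} = 7$ and $N_3 = 22$, in agreement with the values quoted for Fig. 2(a).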


The generic structure of the square $(N_k - 1)\times(N_k - 1)$ adjacency matrix $\tilde{G}$ with components $\tilde{G}_{x,y} = N_k^2\, G_{x,y} = M(x)M(y)$ (where $N_k = 3\times 2^k - 2$) is schematically shown in Fig. 3(b).

Let us denote by $U^{(s)} = \{U^{(s)}_1, \ldots, U^{(s)}_{N_k-1}\}$ the eigenvector of the matrix $G$ corresponding to the eigenvalue $\lambda_s$ ($1 \le s \le N_k - 1$); some of the eigenvalues and eigenvectors may be degenerate. We noticed that, due to the symmetry properties of the adjacency matrix $G$, its leading (maximal) eigenvector $U^{\max}$ has the form


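$$U^{\max} = \Bigl(\underbrace{U_1^{\max},\ldots,U_1^{\max}}_{\text{level 1 (3 elements)}};\ \underbrace{U_2^{\max},\ldots,U_2^{\max}}_{\text{level 2 }(3\times 2\text{ elements})};\ \ldots;\ \underbrace{U_k^{\max},\ldots,U_k^{\max}}_{\text{level }k\ (3\times 2^{k-1}\text{ elements})}\Bigr). \tag{5}$$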


Plugging this ansatz into the equation $G\,U^{\max} = \lambda_{\max} U^{\max}$, we find the leading eigenvalue $\lambda_{\max}$ to be also the leading eigenvalue of an exponentially smaller matrix $B$:

$$B\,V^{\max} = \lambda_{\max} V^{\max}, \tag{6}$$
Figure 3: (a) An example of the adjacency matrix $G$ corresponding to the tree in Fig. 2. The matrix $G$ clearly displays a hierarchical structure. Colors highlight the principal repetitive sequences; (b) The generic structure of the adjacency matrix $G$.


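where $V^{\max} = \{U_1^{\max}, U_2^{\max}, \ldots, U_k^{\max}\}$, and

$$B = \begin{pmatrix}
b_{11} & b_{12} & b_{13} & b_{14} & \cdots & b_{1k} \\
\frac{b_{12}}{2} & b_{22} & b_{23} & b_{24} & \cdots & b_{2k} \\
\frac{b_{13}}{4} & \frac{b_{23}}{2} & b_{33} & b_{34} & \cdots & b_{3k} \\
\frac{b_{14}}{8} & \frac{b_{24}}{4} & \frac{b_{34}}{2} & b_{44} & \cdots & b_{4k} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{b_{1k}}{2^{k-1}} & \frac{b_{2k}}{2^{k-2}} & \frac{b_{3k}}{2^{k-3}} & \frac{b_{4k}}{2^{k-4}} & \cdots & b_{kk}
\end{pmatrix}, \qquad
b_{ij} = 2^{-1-2i}\left(2^{3+k} - 3\times 2^{2+i+k} + 3\times 2^{2i}\right)\frac{2^{j} - 2^{1+k}}{N_k^2} \tag{7}$$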


(recall that $N_k = 3\times 2^k - 2$). Unlike $G$, which is $(N_k - 1)\times(N_k - 1)$, i.e., exponentially large in $k$, the matrix $B$ is only $k\times k$.

In Fig. 4(a,b) we have plotted the matrix elements of $B$ for $k = 5, 30$ in the form of a 3D relief $B_{ij}$ over the base plane $(i,j)$, where $i, j = 1, \ldots, k$. Computing numerically the maximal eigenvalues of the true matrices $B$ defined in (7), for different $k$, we get the plot shown in Fig. 4(c).

Extrapolating the data shown in Fig. 4 to $k \to \infty$, we obtain

$$\lambda_{\max} \approx 0.957, \tag{8}$$

meaning that the largest eigenvalue $\lambda_{\max}$ of the matrix $B$ and, consequently, of the initial Kramers matrix $G$, is bounded from above by a numerical value independent of the matrix size (and without any logarithmic corrections).
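The numerical step described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code. It assumes the reading of Eq. (7) in which $B_{ij} = b_{ij}$ for $j \ge i$ and $B_{ij} = b_{ji}/2^{i-j}$ for $j < i$, with $b_{ij} = 2^{-1-2i}(2^{3+k} - 3\times 2^{2+i+k} + 3\times 2^{2i})(2^j - 2^{1+k})/N_k^2$. The function names are hypothetical, and plain power iteration is substituted for whatever eigenvalue solver was actually used; it is adequate here because all entries of $B$ are positive.

```python
def reduced_matrix(k):
    """k x k matrix B for the regular 3-branching tree with k levels."""
    n = 3 * 2**k - 2                      # total number of vertices N_k

    def b(i, j):
        # Closed form of b_ij, with levels i and j counted from 1.
        f = 2.0**(-1 - 2 * i) * (2**(3 + k) - 3 * 2**(2 + i + k) + 3 * 2**(2 * i))
        return f * (2**j - 2**(1 + k)) / n**2

    B = [[0.0] * k for _ in range(k)]
    for i in range(1, k + 1):
        for j in range(1, k + 1):
            if j >= i:
                B[i - 1][j - 1] = b(i, j)
            else:
                B[i - 1][j - 1] = b(j, i) / 2**(i - j)
    return B


def largest_eigenvalue(A, iterations=1000):
    """Power iteration for a matrix with strictly positive entries."""
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[r][c] * v[c] for c in range(n)) for r in range(n)]
        lam = max(w)                      # v is kept normalized so that max(v) = 1
        v = [x / lam for x in w]
    return lam


for k in (5, 10, 20, 30):
    print(k, largest_eigenvalue(reduced_matrix(k)))
```

For $k = 2$ the matrix is small enough to check by hand: $B = \bigl(\begin{smallmatrix}0.39 & 0.26\\ 0.13 & 0.14\end{smallmatrix}\bigr)$, whose entries agree with summing the full 9-bond Kramers matrix of the 10-vertex tree over each level.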

It is noteworthy that a rough estimate of the eigenvalue $\lambda_{\max}$ can be obtained using the following simple analytical argument. All matrix elements $B_{ij}$ are positive and

$$B_{ij} = \begin{cases} b_{ij}, & j \ge i, \\[4pt] \dfrac{b_{ji}}{2^{\,i-j}}, & j < i. \end{cases}$$

Figure 4: Visualization of the matrix $B$ as a 3D plot of $B_{ij}$ over the base $(i,j)$ for: (a) $k = 5$, (b) $k = 30$; (c) Dependence of the maximal eigenvalue of the $k\times k$ matrix $B$ (see (7)) on $k$. The limiting value of $\lambda_{\max}$ tends to $\approx 0.957$ as $k \to \infty$.


Taking into account the explicit form of the matrix elements $b_{ij}$ for $j \ge i$ (see (7)), we can approximate each element $B_{ij}$ for any $(i,j)$ by $b_{ij}$. This approximation gives the exact value of $B_{ij}$ for $j \ge i$, and provides an upper estimate of the elements $B_{ij}$ for $j < i$.

The maximal eigenvalue, $\lambda_{\max}$, of the $k\times k$ matrix with entries $\{b_{ij}\}$ can be computed exactly. Eq. (7) means that the element $b_{ij}$ can be factorized as

$$b_{ij} = f_i\, g_j; \qquad f_i = 2^{-1-2i}\left(2^{3+k} - 3\times 2^{2+i+k} + 3\times 2^{2i}\right); \qquad g_j = \frac{2^{j} - 2^{1+k}}{(3\times 2^k - 2)^2}. \tag{9}$$


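Since a matrix of the form $f_i g_j$ has rank one, its only nonzero eigenvalue is $\sum_m f_m g_m$. Thus, the largest eigenvalue, $\lambda_{\max}$, reads

$$\lambda_{\max} = \sum_{m=1}^{k} f_m g_m = \frac{7\times 2^{2+2k} - 27k\times 2^k - 15\times 2^k - 13}{3\,(3\times 2^k - 2)^2}. \tag{10}$$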


For $k \to \infty$ one gets from (10)

$$\lambda_{\max}\Big|_{k\to\infty} = \frac{28}{27} \approx 1.037, \tag{11}$$

which is a pretty good approximation to the numerically found true limiting value $\lambda_{\max} \approx 0.957$.
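The closed form for the rank-one estimate can be verified with exact rational arithmetic. The sketch below is an added check, not part of the paper; it assumes the factorization $b_{ij} = f_i g_j$ with $f_i = 2^{-1-2i}(2^{3+k} - 3\times 2^{2+i+k} + 3\times 2^{2i})$ and $g_j = (2^j - 2^{1+k})/(3\times 2^k - 2)^2$, and the helper names are hypothetical.

```python
from fractions import Fraction


def f(i, k):
    # f_i = 2^(-1-2i) * (2^(3+k) - 3*2^(2+i+k) + 3*2^(2i)), kept as an exact fraction
    return Fraction(2**(3 + k) - 3 * 2**(2 + i + k) + 3 * 2**(2 * i), 2**(1 + 2 * i))


def g(j, k):
    # g_j = (2^j - 2^(1+k)) / N_k^2 with N_k = 3*2^k - 2
    return Fraction(2**j - 2**(1 + k), (3 * 2**k - 2)**2)


def rank_one_eigenvalue(k):
    """Only nonzero eigenvalue of the rank-one matrix f_i g_j: the sum of f_m g_m."""
    return sum(f(m, k) * g(m, k) for m in range(1, k + 1))


def closed_form(k):
    num = 7 * 2**(2 + 2 * k) - 27 * k * 2**k - 15 * 2**k - 13
    return Fraction(num, 3 * (3 * 2**k - 2)**2)


for k in (1, 2, 5, 10, 40):
    print(k, rank_one_eigenvalue(k) == closed_form(k), float(closed_form(k)))
```

Because the entries of the rank-one matrix bound those of $B$ from above and all entries are positive, this sum is an upper bound on the true $\lambda_{\max}$ at every $k$, which is consistent with $28/27$ lying above the numerical value.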


B. Sparse regular trees 

We can generalize our approach to “sparse” regular trees, in which branchings are connected by linear subchains of $s$ links. By definition, for $s = 1$ we return to the former case of branching at every node of the tree. A particular example of a dendrimer of $N = 19$ nodes with 3-branching vertices separated by 2-link subchains (i.e., $s = 2$) is shown in Fig. 2(b).

To construct the Kramers adjacency matrix, $G^{(s)}$, for such a tree, one proceeds as above, using the descent diagram representation, where we first enumerate sequentially the links in each linear subchain, and then pass to the next subchain in the same level (compare Fig. 2(a) and Fig. 2(b)). The matrix $G^{(s)}$ constructed in such a way closely resembles the former one, $G$, with each matrix element $G_{x,y}$ replaced by an $s\times s$ block (a $2\times 2$ block for the particular example of Fig. 2(b)).

The largest eigenvalue, $\lambda^{(s)}_{\max}$, of the composite adjacency matrix, $G^{(s)}$, can be estimated as

$$\lambda^{(s)}_{\max} \approx \lambda_{\max}\,\lambda^{\rm lin}_{\max}(s), \tag{12}$$

where $\lambda^{\rm lin}_{\max}(s)$ is the maximal eigenvalue of the Kramers matrix for a linear chain of $s$ links and $\lambda_{\max}$ is given by (8). The value of $\lambda^{\rm lin}_{\max}(s)$ reads:

$$\lambda^{\rm lin}_{\max}(s) = \frac{s+1}{4s^2\sin^2\dfrac{\pi}{2(s+1)}}; \qquad \lambda^{\rm lin}_{\max}(s = 1) = 1, \qquad \lambda^{\rm lin}_{\max}(s \gg 1) \simeq \frac{s}{\pi^2}. \tag{13}$$


As follows from (8), the value $\lambda_{\max}$ can be approximated by some constant, $c$, of order unity. Thus, we can estimate $\lambda^{(s)}_{\max}$ for all regular sparse trees as

$$\lambda^{(s)}_{\max} \approx c\,\lambda^{\rm lin}_{\max}(s) = cs, \tag{14}$$

where $c$ absorbs all numerical constants.


III. DISCUSSION 


Let us now summarize what we have learned about trees. To begin with, recall that for a linear polymer the whole spectrum of the Kramers matrix is known [14] (see also small corrections in [?] and also [3]); in particular, $\lambda_{\max} = N/\pi^2$. Since the spanning distance in this case is just $L = N$, we have $\lambda_{\max} = L/\pi^2 \sim L$.
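The linear-chain value can be checked directly. The sketch below is an added illustration with hypothetical function names. For a chain of $N$ monomers, bond $x$ separates $x$ monomers from $N - x$, so with the normalization $G_{x,y} = M(x)M(y)/N^2$ the Kramers matrix is $G_{x,y} = \min(x,y)\,(N - \max(x,y))/N^2$. This is $1/N$ times the inverse of the standard tridiagonal $(-1, 2, -1)$ matrix, whose spectrum is known, which gives $\lambda_{\max} = [4N\sin^2(\pi/2N)]^{-1} \to N/\pi^2$ at large $N$.

```python
import math


def chain_kramers_matrix(n):
    """Kramers matrix of a linear chain with n monomers (n - 1 bonds)."""
    return [[min(x, y) * (n - max(x, y)) / n**2 for y in range(1, n)]
            for x in range(1, n)]


def largest_eigenvalue(A, iterations=200):
    """Power iteration; all entries of the chain matrix are positive."""
    size = len(A)
    v = [1.0] * size
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[r][c] * v[c] for c in range(size)) for r in range(size)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam


def exact_lambda_max(n):
    """Largest eigenvalue from the spectrum of the tridiagonal (-1, 2, -1) matrix."""
    return 1.0 / (4 * n * math.sin(math.pi / (2 * n))**2)


for n in (10, 50, 100):
    numeric = largest_eigenvalue(chain_kramers_matrix(n))
    print(n, numeric, exact_lambda_max(n), n / math.pi**2)
```

Power iteration converges quickly here because the second eigenvalue is roughly four times smaller than the first.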

Next, for the perfect 3-branching dendrimer we have established above that $\lambda_{\max} \sim 1$. This has to be compared with the average spanning distance of this tree. For a dendrimer with $k$ generations and, accordingly, $3\times 2^k$ ends, the sum of distances along the backbone from one end to all other ends is equal to $2 + 2^{k+1}(3k+1)$. Hence, dividing by the number of other ends, $L = \frac{2 + 2^{k+1}(3k+1)}{3\times 2^k - 1}$, and at large $k$ this asymptotically tends to $L \simeq 2k$. Since $N = 3\times 2^{k+1} - 2$ (or, equivalently, $k = \log_2\frac{N+2}{3} - 1 \simeq \log_2 N$), we have $L \simeq 2\log_2 N$. Therefore, we can say that $\lambda_{\max} \sim L/\log_2 N$.
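The end-to-end distance sum quoted above is easy to confirm by breadth-first search. This is an added check with hypothetical helper names; it assumes that the ends sit $k + 1$ bonds from the root, which is the convention that reproduces the $3\times 2^k$ ends and $N = 3\times 2^{k+1} - 2$ vertices used in this paragraph. The average is taken over the other ends, which is one natural reading of "average spanning distance".

```python
from collections import deque


def dendrimer(depth):
    """Adjacency lists of a 3-branching dendrimer with ends `depth` bonds from the root."""
    adj = [[]]
    frontier = [0]
    for level in range(1, depth + 1):
        new_frontier = []
        for parent in frontier:
            for _ in range(3 if level == 1 else 2):
                child = len(adj)
                adj.append([parent])
                adj[parent].append(child)
                new_frontier.append(child)
        frontier = new_frontier
    return adj, frontier                  # frontier now lists the ends


def distances_from(adj, src):
    dist = {src: 0}
    queue = deque([src])
    while queue:
        w = queue.popleft()
        for nb in adj[w]:
            if nb not in dist:
                dist[nb] = dist[w] + 1
                queue.append(nb)
    return dist


def end_to_end(k):
    """Return (N, number of ends, sum of distances from one end to all other ends)."""
    adj, ends = dendrimer(k + 1)
    dist = distances_from(adj, ends[0])
    return len(adj), len(ends), sum(dist[e] for e in ends)


for k in (1, 3, 6, 10):
    n, n_ends, total = end_to_end(k)
    print(k, n, total, total / (n_ends - 1), 2 * k)
```

By symmetry every end gives the same sum, so a single source vertex suffices.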

Thus, for a regular tree, $\lambda_{\max}$ is related to the average spanning distance $L$ via a factor of order $\ln N$. Let us show now that this is the worst case, and for all other trees the estimate $\lambda_{\max} \sim L$ is even more accurate.

Sparse regular trees provide good insight in this sense, as they smoothly interpolate between regular trees and linear chains. As we have seen, in this case $\lambda_{\max} \approx s$. On the other hand, in this case $N = 3s\,(2^{k+1} - 1) + 1$, or $k = \log_2\left[\frac{N + 3s - 1}{6s}\right]$; at large $N$ this is asymptotic to $k \simeq \log_2\left[\frac{N}{6s}\right]$. At the same time, $L \simeq 2sk$, or $L \simeq 2s\log_2\left[\frac{N}{6s}\right]$. Thus,

$$\lambda_{\max} \sim \frac{L}{\log_2\left[\frac{N}{6s}\right]}.$$

We see that the factor relating $\lambda_{\max}$ to the mean spanning distance $L$ smoothly changes between about $\ln N$ and about 1 as $s$ changes from 1 to $N$.

In fact, it is clear physically that a regularly branching tree is the most compact and the most difficult to swell of all quenched branched polymers, while a linear polymer is the least compact and the easiest to swell. Since we have rigorously analyzed both of these extremes, we conclude that the expression (3), which establishes the conformational entropy up to logarithmic corrections, also implies the validity of the scaling estimate of entropy (1) in terms of the average spanning distance.
Appendix A: Generalized Kramers theorem


Here, we compute the gyration radius probability distribution for a Gaussian tree. The approach follows the works of M. Fixman [14-16], and [17], where it was generalized for Gaussian rings.


Suppose for simplicity that the branching number is $p = 3$, and there are only branchings and ends, namely, $n$ three-valent vertices and $n + 2$ ends, in total $N = 2n + 2$ monomers. If $\mathbf{r}_i$ are the position vectors of these “monomers,” then the gyration radius reads

$$R^2 = \frac{1}{2N^2}\sum_{i,j}\left(\mathbf{r}_i - \mathbf{r}_j\right)^2. \tag{A1}$$


On the tree, every $\mathbf{r}_i - \mathbf{r}_j$ is uniquely represented as a sum of bond vectors $\mathbf{a}_k = \mathbf{r}_{i'} - \mathbf{r}_i$, where $i$ and $i'$ are the monomers connected by bond $k$. Accordingly, the gyration radius can be represented as

$$R^2 = \sum_{x,y} G_{x,y}\,\mathbf{a}_x\cdot\mathbf{a}_y, \tag{A2}$$

where the indices $x$ and $y$ label bonds (unlike $i$, $i'$ and $j$ above, which label vertices or monomers), and $G_{x,y}$ is the $(N-1)\times(N-1)$ Kramers matrix illustrated in Fig. 1.

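For a Gaussian tree we can do more and find the probability distribution of $R^2$, assuming each bond has the probability distribution $\sim e^{-3\mathbf{a}^2/2a^2}$. The characteristic function of $R^2$ reads

$$\Phi(s) = \left\langle e^{isR^2}\right\rangle = A\int \mathcal{D}\{\mathbf{a}\}\, \exp\left[-\sum_x \frac{3\mathbf{a}_x^2}{2a^2} + is\sum_{x,y} G_{x,y}\,\mathbf{a}_x\cdot\mathbf{a}_y\right], \tag{A3}$$

where the explicit expression for the normalization factor $A$ is dropped for brevity. Rotating now the coordinate system in this $(N-1)$-dimensional space of $\{\mathbf{a}\}$ to the basis of eigenvectors $\{\mathbf{b}\}$ of the matrix $G$, we obtain

$$\Phi(s) = A\int \mathcal{D}\{\mathbf{b}\}\, \exp\left[-\sum_p\left(\frac{3}{2a^2} - is\lambda_p\right)\mathbf{b}_p^2\right] = \prod_p\left(1 - \frac{2isa^2}{3}\lambda_p\right)^{-3/2}, \tag{A4}$$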


where $\lambda_p$ are the eigenvalues, and the normalization factor $A$ is reconstructed from the condition $\Phi(s)|_{s=0} = 1$. From here, finding the probability distribution of $R^2$ is a matter of the inverse Fourier transform of $\Phi(s)$:

$$P(R^2) = \frac{1}{2\pi}\int ds\, e^{-isR^2 + \ln\Phi(s)}. \tag{A5}$$


Since we are interested in the behavior of $P(R^2)$ at large $R$, the asymptotics is controlled by the singularity of $\Phi(s)$ closest to the origin in the complex $s$-plane, which corresponds to the largest eigenvalue $\max_p[\lambda_p] = \lambda_{\max}$. In the vicinity of this singularity there is a saddle point which dominates the integral, and we can evaluate the inverse Fourier transform integral by steepest descent. The equation for the saddle point location reads $R^2 \simeq \frac{a^2\lambda_{\max}}{1 - 2isa^2\lambda_{\max}/3}$, or $s = i\left(\frac{3}{2R^2} - \frac{3}{2a^2\lambda_{\max}}\right)$. Thus, the result of the saddle point integration (up to logarithmic corrections) is

$$P(R^2)\Big|_{R\to\infty} \sim \exp\left[-\frac{3R^2}{2a^2\lambda_{\max}} + \frac{3}{2}\ln\frac{R^2}{a^2\lambda_{\max}}\right] \sim e^{-3R^2/(2a^2\lambda_{\max})}, \tag{A6}$$

yielding the expected generalization of the Kramers theorem (3).
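For completeness, the steepest-descent step can be written out explicitly. This is a worked sketch added for illustration. It assumes the convention $\Phi(s) = \langle e^{isR^2}\rangle$ with the inverse transform carrying $e^{-isR^2}$, and it keeps only the factor of $\Phi(s)$ that contains $\lambda_{\max}$, the one responsible for the singularity closest to the origin. The symbols $S(s)$ and $s_*$ are introduced here and are not used in the original text.

```latex
\begin{aligned}
S(s) &\equiv -isR^2 - \tfrac{3}{2}\ln\!\Bigl(1 - \tfrac{2isa^2}{3}\lambda_{\max}\Bigr), \\
\frac{dS}{ds} = 0 &\;\Longrightarrow\;
  R^2 = \frac{a^2\lambda_{\max}}{1 - 2isa^2\lambda_{\max}/3}
  \;\Longrightarrow\;
  s_* = i\Bigl(\frac{3}{2R^2} - \frac{3}{2a^2\lambda_{\max}}\Bigr), \\
S(s_*) &= -\frac{3R^2}{2a^2\lambda_{\max}} + \frac{3}{2}
  + \frac{3}{2}\ln\frac{R^2}{a^2\lambda_{\max}}.
\end{aligned}
```

The first term of $S(s_*)$ is the Gaussian exponent of the generalized Kramers theorem, while the logarithm contributes only a power of $R$, which is the kind of correction the theorem leaves unspecified.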